DFA Learning of Opponent Strategies

Authors

  • Gilbert L. Peterson
  • Diane J. Cook
Abstract

This work studies the control of robots in the adversarial world of "Hunt the Wumpus". The hybrid learning algorithm that controls the robots' behavior combines a modified RPNI algorithm with a utility update algorithm. The modified RPNI algorithm is a DFA learning algorithm used to learn opponents' strategies. The utility update algorithm uses the information gleaned from the modified RPNI to quickly bring the agent's mission to a successful conclusion.
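The abstract pairs a passive DFA learner (RPNI, which builds a prefix tree acceptor from observed behavior and then merges states) with a utility update rule. As a rough illustration of the DFA-learning half, here is a minimal state-merging sketch in Python. All names are illustrative, not the authors' implementation, and the sketch simplifies real RPNI: instead of folding merges to keep the hypothesis deterministic, it checks consistency on a nondeterministic quotient automaton.

```python
def build_pta(positives):
    """Prefix tree acceptor: state 0 is the root; one state per observed prefix."""
    trans, accept, next_state = {}, set(), 1
    for word in positives:
        state = 0
        for sym in word:
            if (state, sym) not in trans:
                trans[(state, sym)] = next_state
                next_state += 1
            state = trans[(state, sym)]
        accept.add(state)
    return trans, accept, next_state

def quotient(trans, accept, rep):
    """Map every state through rep; the result may be nondeterministic."""
    q_trans, q_accept = {}, {rep[s] for s in accept}
    for (s, a), t in trans.items():
        q_trans.setdefault((rep[s], a), set()).add(rep[t])
    return q_trans, q_accept

def nfa_accepts(q_trans, q_accept, word):
    """Simulate the (possibly nondeterministic) quotient automaton."""
    states = {0}
    for sym in word:
        states = set().union(*(q_trans.get((s, sym), set()) for s in states))
    return bool(states & q_accept)

def rpni_sketch(positives, negatives):
    """Greedily merge each state into the earliest compatible predecessor."""
    trans, accept, n = build_pta(positives)
    rep = {s: s for s in range(n)}
    for q in range(1, n):
        if rep[q] != q:          # already merged away
            continue
        for p in range(q):
            if rep[p] != p:      # not a representative
                continue
            trial = {s: (p if r == q else r) for s, r in rep.items()}
            q_trans, q_accept = quotient(trans, accept, trial)
            # keep the merge only if no counterexample is accepted
            if not any(nfa_accepts(q_trans, q_accept, w) for w in negatives):
                rep = trial
                break
    return quotient(trans, accept, rep)
```

For example, from positives `["a", "aa", "aaa"]` and the negative `[""]`, the sketch collapses the prefix tree into a two-state acceptor for the language "one or more a's".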


Similar articles


model of its opponent's strategy based on its past behavior, and uses the model to predict its future behavior. We represent an interaction between agents by a repeated game and restrict our attention to opponent strategies that can be represented by DFA. Learning a minimal DFA without a teacher was proved to be hard. We presented an unsupervised algorithm, US-L*, based on Angluin's L* algorithm...


Using a Priori Information for Fast Learning Against Non-stationary Opponents

For an agent to be successful in interacting with many different and unknown types of opponents, it should excel at quickly learning a model of the opponent and adapting online to non-stationary (changing) strategies. Recent works have tackled this problem by continuously learning models of the opponent while checking for switches in the opponent's strategy. However, these approaches fail to use a pr...


Winning Opponent Counter Strategy Selection in Holdem Poker

The game of poker presents an interesting and complex problem for game theorists and researchers in machine learning. Current work on the subject focuses on how to develop optimal counter strategies, often referring to the Upper Confidence Bounds (UCB1) algorithm to determine which of these counter strategies is optimal for an unknown opponent. We present a new method for taking a learned set o...


Combining Opponent Modeling and Model-Based Reinforcement Learning in a Two-Player Competitive Game

When an opponent with a stationary and stochastic policy is encountered in a two-player competitive game, model-free Reinforcement Learning (RL) techniques such as Q-learning and Sarsa(λ) can be used to learn near-optimal counter strategies given enough time. When an agent has learned such counter strategies against multiple diverse opponents, it is not trivial to decide which one to use when a ...


Using iterated reasoning to predict opponent strategies

The field of multiagent decision making is extending its tools from classical game theory by embracing reinforcement learning, statistical analysis, and opponent modeling. For example, behavioral economists conclude from experimental results that people act according to levels of reasoning that form a “cognitive hierarchy” of strategies, rather than merely following the hyper-rational Nash equi...




Publication year: 1998